- 
                Notifications
    You must be signed in to change notification settings 
- Fork 28.9k
[SPARK-5009] [SQL] Long keyword support in SQL Parsers #3926
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| Test build #25146 has started for   PR 3926 at commit  
 | 
| LGTM. | 
| @OopsOutOfMemory Yea, I will do that after #3924 merged. :) | 
| Test build #25146 has finished for   PR 3926 at commit  
 | 
| Test PASSed. | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not lazy val?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, you're right, will fix this.
| Test build #25157 has started for   PR 3926 at commit  
 | 
| Test build #25157 has finished for   PR 3926 at commit  
 | 
| Test PASSed. | 
01ff9c6    to
    c620afa      
    Compare
  
    | Test build #25210 has started for   PR 3926 at commit  
 | 
| Test build #25210 has finished for   PR 3926 at commit  
 | 
| Test FAILed. | 
c620afa    to
    dd0e60a      
    Compare
  
    | Test build #25212 has started for   PR 3926 at commit  
 | 
| Test build #25212 has finished for   PR 3926 at commit  
 | 
| Test PASSed. | 
| Removed the WIP, and updated the description. | 
| Test build #25228 has started for   PR 3926 at commit  
 | 
| Test build #25228 has finished for   PR 3926 at commit  
 | 
| Test PASSed. | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@chenghao-intel
May
def apply(input: String): LogicalPlan 
to be
def apply(input: String): Option[LogicalPlan]
?
It's not consistent with DDLParser .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, we probably can combine couple of Parser work in delegation mode, but currently, I just simply wrote another version of the def apply in DDLParser.
| This is awesome, thanks for cleaning this up. One question though, do we really want to have case insensitive keywords? Are there any systems that actually do that? If it is something we want to keep then maybe you can add some documentation to the normalizer classes. | 
| BTW, I'm going to try to merge #3431 first, which might conflict with this. | 
536e592    to
    080410a      
    Compare
  
    | Test build #25385 has started for   PR 3926 at commit  
 | 
| @marmbrus You're right, SQL keywords should always be case insensitive. I've updated the code for this and rebased to the latest master. | 
| Test build #25385 has finished for   PR 3926 at commit  
 | 
| Test FAILed. | 
| Test build #25390 has started for   PR 3926 at commit  
 | 
| Test build #25390 has finished for   PR 3926 at commit  
 | 
| Test PASSed. | 
4828f46    to
    f3c0abc      
    Compare
  
    | Test build #25511 has started for   PR 3926 at commit  
 | 
f3c0abc    to
    686660f      
    Compare
  
    | Test build #25513 has started for   PR 3926 at commit  
 | 
| Test build #25513 has finished for   PR 3926 at commit  
 | 
| Test PASSed. | 
| Test build #25511 has finished for   PR 3926 at commit  
 | 
| Test PASSed. | 
| retest this please | 
| Test build #25810 has started for   PR 3926 at commit  
 | 
| Test build #25810 has finished for   PR 3926 at commit  
 | 
| Test PASSed. | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is kind of a nit, but since this is only used in AbstractSparkSQLParser and its subclasses I'd just make it a protected method to avoid the syntatic overhead of a whole separate object. I believe you are doing further refactoring so maybe that can be done in a followup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's also used withinSqlLexical.processIdent, but you're right, we'd better keep it the minimize visibility. I will do that in #4015 .
| Thanks for doing this, much better than the previous hack! Merged to master. | 
* The `SqlLexical.allCaseVersions` will cause `StackOverflowException` if the key word is too long, the patch will fix that by normalizing all of the keywords in `SqlLexical`. * And make a unified SparkSQLParser for sharing the common code. Author: Cheng Hao <[email protected]> Closes apache#3926 from chenghao-intel/long_keyword and squashes the following commits: 686660f [Cheng Hao] Support Long Keyword and Refactor the SQLParsers
SqlLexical.allCaseVersionswill causeStackOverflowExceptionif the key word is too long, the patch will fix that by normalizing all of the keywords inSqlLexical.